Gauging the importance of structural parameters for hyperfine coupling constants in organic radicals

The identification of fundamental relationships between atomic configuration and electronic structure typically requires experimental empiricism or systematic theoretical studies. Here, we provide an alternative statistical approach to gauge the importance of structure parameters, i.e., bond lengths, bond angles, and dihedral angles, for hyperfine coupling constants in organic radicals. Hyperfine coupling constants describe electron–nuclear interactions defined by the electronic structure and are experimentally measurable, for example, by electron paramagnetic resonance spectroscopy. Importance quantifiers are computed with the machine learning algorithm neighborhood components analysis using molecular dynamics trajectory snapshots. Atomic–electronic structure relationships are visualized in matrices correlating structure parameters with coupling constants of all magnetic nuclei. Qualitatively, the results reproduce common hyperfine coupling models. Tools to use the presented procedure for other radicals/paramagnetic species or other atomic structure-dependent parameters are provided.


Introduction
Machine learning techniques are increasingly applied in chemical sciences. 1 Applications can be roughly classied into predictions of properties based on atomic/electronic structure 2 and (statistical) evaluation of experimental results. 1 In the subeld of electron magnetic resonance, applications focus on the interpretation of acquired spectra, e.g., from continuouswave electron paramagnetic resonance (EPR), 3 double electron-electron resonance (DEER), 4 electron-nuclear double resonance (ENDOR), 5 or hyperne sublevel correlation spectroscopy (HYSCORE). 6 Most machine learning techniques are prediction models that can predict a numerical output value by regression or classication based on input values (features). To improve model stability and accuracy, the dimensionality of feature sets can be reduced through preprocessing, e.g., by transformation into a new basis or selecting features through ranking their importance in determining the output. Neighborhood component analysis (NCA) is one example for a non-paramagnetic feature selection algorithm introduced by Goldberger et al. 7 It was improved through introduction of a regularization term by Yang et al., 8 and validated in real-world applications. 9,10 This algorithm can be used for feature selection but also simply to provide a gauge for importance of a feature for a particular response, as will be exploited herein.
Frequently targeted spectroscopic parameters of radicals and paramagnetic centers are hyperne coupling constants, resulting from electron-nuclear interactions, that can be used to evaluate the underlying atomic and electronic structure. Hyperne couplings arise from a non-zero spin density distribution evoked by spin delocalization and polarization. In small systems, some origins of hyperne interactions based on, e.g., exchange interactions or hyperconjugation, were identied. [11][12][13][14] These mechanisms are oen described locally for simplicity; however, the entirety of the electronic structure dictated by the type and conguration of all atoms denes the spin density distribution and, thereby, hyperne coupling magnitudes.
Hyperne couplings are described by an anisotropic rank 2 tensor A with three principal components A x,y,z and tr(A) = A iso . 15 Since interpretation of experimental A x,y,z values is non-trivial, a comparison with calculated values from ab initio models is usually performed. [16][17][18][19] Sufficiently accurate calculations are oen possible, but require incorporation of electron correlation dened by the used electronic theory, solvation effects dened by the structural model, and dynamic contributions. Dynamics can be included by averaging computed hyperne coupling constants from molecular dynamics snapshots, 20,21 since molecular motion on timescales of fs is fast compared to the experimental time scale of ps to ns for EPR. Trajectory analyses generally show a non-Gaussian coupling constant distribution in histograms. 21,22 In principle, these trajectories and the computed coupling constants contain statistical information on structure-hyperne relations that are usually removed through the averaging, but will be exploited herein.
In this work, molecular dynamics trajectories are used to derive correlations between molecular structure and hyperne coupling constants of common organic radicals using machine learning. Correlations are quantied by feature weights calculated using the NCA algorithm. Features involve all bonds, angles, and dihedrals extracted from the trajectory's coordinates, and targeted responses are A x,y,z,iso . First, adequate simulation parameters are derived using the principle of magnetic equivalence. Secondly, example radicals are analyzed regarding structure-hyperne relations extracted by the algorithm. The discussion is complemented by considering the structure, spin density distribution, hyperne tensor orientations/magnitudes, and links to relations known in literature.

Density functional theory calculations
Density functional theory (DFT) calculations were conducted using ORCA 5.0.1. 23 Organic radicals were geometry optimized using the hybrid functional B3LYP 24,25 and def2-TZVP 26 basis sets. XYZ les are accessible in the data repository found at https://doi.org/10.26165/JUELICH-DATA/UH0LM2. The ORCA-MD module 27 was used to calculate ab initio molecular dynamics (MD) trajectories, with velocities initialized assuming a temperature of 900 K or, for temperature comparison, in a range of 150 to 2000 K. Initialized temperatures were maintained using a Nosé-Hoover chains 28-30 (NHC) thermostat with a time constant of 20 fs. Time steps of 0.5 fs were applied to address high-frequency hydrogen motion. For MD snapshots in random steps of 0.5-40 fs along the trajectory, hyperne coupling tensors for 1 H, 13 C, and 17 O were calculated using B3LYP and EPR-III (ref. 31) basis sets, disregarding contributions from spin-orbit coupling. Hyperne tensors are referenced to the g-tensor frame. Example ORCA input les are given in the ESI, Section F. †

Data processing workow
Further data processing and analysis was performed in MATLAB R2021b using the Statistics and Machine Learning Toolbox and Deep Learning Toolbox, involving (a) conversion of xyz coordinates into position-independent structure parameters comprising bonds, angles, and dihedrals, and (b) NCA to identify structure parameter importance for changes in hyper-ne coupling constants. The MATLAB scripts are available online (https://doi.org/10.26165/JUELICH-DATA/UH0LM2).
2.2.1 Structure parameter extraction. MD xyz trajectory les as outputted by ORCA are used for parameter extraction. The rst structure, which is the geometry optimized structure, is analyzed to yield atomic distances with an upper cutoff of 1.5 Å to only include chemical bonds. These bonds are assigned to atomic identiers since MD snapshots might comprise other non-chemical-bond distances <1.5 Å along the trajectory. Angles are extracted based on bond connectivities. Dihedrals are specied by a chain of four bonded atoms where one dihedral per set of two central atoms is dened. To use structure parameters as machine learning features, dihedral angles are transformed to cosine and sine values to account for rotational symmetry, i.e., circumvent problems with the parameter discontinuity at 360°= 0°. Bond angles are given in degrees to reduce the total number of features. Bonds are given in Å.
2.2.2 Neighborhood components analysis. The NCA 7,8 algorithm (fscnca in MATLAB, Statistics and Machine Learning Toolbox) is used to extract importance values of specic features for a resulting response. Features imply all extracted structure parameters as explained in Structure parameter extraction (2.2.1). Responses are A x , A y , A z , or A iso . As hyperne interactions are performed for all magnetic nuclei, importance values form a 3D matrix per molecule. The regularization parameter l was either optimized using the Limited memory Broyden-Fletcher-Goldfarb-Shanno 32 (LBFGS) algorithm and fourfold crossvalidation, or it was set to 0.05, as indicated. Features were individually standardized, i.e., transformed to numbers ranging from 0 to 1, ensuring comparability albeit using a single l.

Workow
The workow is schematized in Fig. 1. First, organic radicals (Table 1) are initialized as xyz structures, pre-optimized using force-elds, and ultimately optimized with density functional theory. These structures serve as input for ab initio molecular dynamics (MD) simulations using a simulation temperature of 900 K, chosen based on results presented in the section Parameter choice below. Up to 30 ps MD simulation trajectories were computed to provide sufficient statistics for the machine learning algorithm. To account for the non-linear increase of structure parameters with the number of atoms in a molecule, the MD step number lies between 20 000 (a10 ps) for the methyl and 60 000 (a30 ps) for the tryptophan-type radical, compromising between necessary statistics and computation resources. MD trajectories represent the computationally most demanding step, using more than 93% of the computational resources (Table S1 †). Along MD trajectories, snapshots were chosen for hyperne coupling tensor calculations. We deliberately chose not to calculate hyperne tensors for every step to reduce structural similarity between single data points. Because structure parameters oscillate predominantly according to a superposition of regular sine waves, 33 choosing snapshots in regular intervals will signicantly decrease specic structure parameter variations if the interval corresponds to the frequency of a normal mode. Thus, we use an evenly distributed randomized step interval of 1-80 for hyperne coupling calculations.

Neighborhood components analysis
The hyperne coupling constants can be described as output (response variable) given a dened set of intrinsic variables including distances (chemical bonds), angles, and dihedrals.
These structure parameters can serve as features for the machine learning algorithm NCA. 7,8 NCA mathematically maps the input features to the outputs by generating weights for each feature. The feature weights determine the numerical importance of each feature in this mapping. Intuitively, features with higher weights contribute more to the output than features with lower weights. For stochastically uctuating features, regularization is a strategy to prevent the model from overaccurately tracing these uctuations, which could lead to unphysical results that would be referred to as overtting. Regularization penalizes feature weights of excessive value by adding a term to the error estimation between the real outputs and the model prediction. In NCA, 8 the penalty term consists of the squared weights multiplied with a regularization parameter l. Herein, l = 0.05 is chosen based on results from the section Parameter choice below.
To improve the accuracy of machine learning algorithms, choice and quality of features are crucial. 34 Herein, features are selected as follows. In contrast to bond lengths being continuous variables, dihedrals are circular variables exhibiting a discontinuity, but can be converted to continuous sine and cosine values. Bond angles are not converted likewise since they are dened in the interval [0°, 180°] and thus, are pseudocontinuous. Another possibility to reduce the number of features is to consider overlapping dihedrals. Dihedrals are described by four chemically connected atomic positions. For a standard descriptive example like 1,2-dichloroethane, typically a single dihedral is used, assuming that the other atomic positions can be inferred by symmetry. However, for randomly moving atoms in molecular dynamics, all existing sets of four connected atoms need to be considered. Yet, we expect the additional dihedrals to be highly correlated. Hence, only one dihedral per two dening central atoms is kept as a feature. Before feeding the NCA algorithm, each feature is individually standardized to have zero mean and unit standard deviation, avoiding artefacts from different unit scales and variation magnitudes. This strategy for standardization is ideally suited for continuous and Gaussian-like distributed features, which are oen observed in MD trajectories, particularly for bonds and angles. 21,22,35,36 Other standardization strategies might be superior depending on the conformational energy landscape.
On the basis of the created feature set, NCA can be performed for an arbitrary response variable that depends on the features. Herein, we include hyperne coupling constants  A x,y,z,iso for all magnetic nuclei in a given organic radical. A simple example is the methyl radical with atomic labels, spin density, and hyperne tensor plot depicted in Fig. 2a-c, respectively. The importance matrix regarding the central 13 C and all three 1 H nuclei shows clear non-zero entries ( Fig. 2d and e). A iso correlations are also evident from visible inspection of the raw data plotted in Fig. 2f, conrming NCA-derived correlations. For the response A iso (H1), positive correlation with the angle H2C1H3 opposite of H1 and negative correlations with the bond length H1C1 and the sum of all bond angles are inferred. For A iso (C1), positive correlation with all bond lengths and negative correlation with the sum of all angles is inferred. Without discussing chemical and physical origins at this point, a set of symmetry arguments are evident for the matrix in Fig. 2d. Due to magnetic equivalence of H1, H2, and H3, the upper le 3 × 3 matrix characterizing bonds and the adjacent lower 3 × 3 matrix characterizing angles should ideally be centrosymmetric. Similarly, all hydrogen atoms should be equally important for A iso (C1), so importance values for bonds and angles should be identical. Deviations from these symmetry considerations, like the non-centrosymmetric entries with importance values below one, are assumed to stem from insufficient statistics along the MD trajectory.

Parameter choice
The chosen MD parameters have a large inuence on the dataset quality for NCA. Since the importance matrix elements are unknown a priori, symmetry arguments as introduced in the last section are exploited to gauge parameter suitability. More specically, for H atom symmetry, the rst three matrix columns from Fig. 2d are permuted to meet the symmetry arguments, i.e., the rst entry of nucleus HX, with X˛{1, 2, 3}, corresponds to the bond HXC1 and the fourth entry corresponds to the opposite angle not involving HX. Permutations are performed for A x,y,z , and the cumulative mean square error comparing each pair of the {H1, H2, H3} group is calculated. For C atom symmetry, the standard deviation of the rst three bond-associated entries and the following three angleassociated entries are calculated for A x,y,z , and the overall mean standard deviation is computed. Using these symmetry-based descriptors, the MD temperature T MD is investigated (Fig. 3, central panel, see also Fig. S1 †). For 0.03 # l # 0.12, a window for minimum error is evident from roughly 700-1200 K, both for H and C symmetry gauges. By increasing T MD , the amplitude of structure parameter variations increases concomitantly. For example, the maximum amplitude of bond stretching/compression is 0.08 Å at 150 K and 0.26 Å at 2000 K. Maximum amplitude of angle stretching/ compression is 7°at 150 K and 32°at 2000 K. Results suggest that in the low-to-intermediate T MD regime, an increase in T MD benets the accuracy of the model, because the numerical spread of features/responses and sampling of the conformational space is enhanced. However, in the intermediate-to-high T MD regime, the trend is reversed. Extreme structure parameters might lead to unforeseen high-temperature effects on the electronic structure, e.g. relatively strong hydrogen 1s orbital overlap for angles <90°observed for T MD = 2000 K, resulting in alterations or discontinuities of the structure-hyperne relationship. Deviations from realistic chemical structures using T MD = 1500 K were also reported 37 and above 2000 K pyrolysis of hydrocarbons occurs. 38 We note that importance values may be slightly dependent on T MD , which is however most likely of minor signicance for the semi-quantitative ndings presented herein (Fig. S1 †).
In Fig. 3 (right panel), the symmetry-based descriptors are used to investigate the dataset size N. For increasing N, statistics become better and correlations between atomic movements are averaged out. For high l values, errors are generally smaller because weakly correlated features are forced to an importance value of zero. Equilibration of the descriptors for this trajectory is achieved with around N z 350. Although minimization of the error is the ideal performance indicator, the importance matrix is already qualitatively reproduced with far smaller N; the underlying symmetry is indicated already for N = 25 (Fig. S1 †) where the absolute importance values still deviate signicantly from N = 505 (Fig. 2), but correlated or uncorrelated classication is already satisfying. Using rough extrapolation, dataset sizes were increased for increasing amounts of magnetic nuclei in a radical (Table 1). To compromise with available computational resources, the N-to-feature ratio decreases from 84 for the methyl radical to 24 for the tryptophan-type radical.
Lastly, the regularization parameter l needs to be chosen appropriately, with feature weights approaching zero for l / N, but a too small l over-interprets the data (overtting). l can be tuned by optimizing the accuracy of the NCA model using cross-validation, where parts of the data are not used to train the model but to later validate its accuracy. Prediction accuracy optimization (Fig. S2 †) for the methyl, ethyl, and methyl peroxy radicals was performed for each coupling constant individually, resulting in 0.01 # l # 0.26 with an average of 0.07 and a median of 0.03 (Table S2 †). The optimized l values are very similar for magnetically equivalent nuclei and vary for different coupling magnitudes and nucleus type. Based on these test sets and to ensure better comparability in importance matrices, l was chosen to be 0.05, in between the average and median for all gures in the main text. Additional gures with l = 0.025 and 0.100 for all radicals are provided in the ESI, Section E. †

Chemical implications
In this section, organic radicals (Table 1) are investigated as test cases for the introduced feature design and NCA algorithm. In analogy to the methyl radical, importance matrices can be cross-checked by symmetry arguments. On that basis, the relationship of chemical information with atomic and electronic structure will be extracted, since hyperne coupling constants directly result from the spin density distribution that is a function of the molecular orbital structure. The spin-orbit coupling contribution is disregarded in this work, as is common for organic radicals, 16 saving computational resources and removing selection biases since spin-orbit coupling is signicantly DFT-functional dependent. 39,40 To reduce complexity, we will largely base the discussion on A iso values.
3.4.1 Methyl radical. The methyl radical CH $ 3 is an experimentally 41 and theoretically 42,43 well-characterized organic radical with a carbon 2p z orbital centered free electron (Fig. 2b), largely dening the anisotropy and magnitude of the 13 C coupling tensor (Fig. 2c). In addition, spin density can spread over the molecule via spin polarization. For CH $ 3 , exchange interactions between the (p) 1   which explains less negative A x,y that is pointing toward the C atom and parallel to the 13 C hyperne z-axis, respectively.
The importance matrix shows that A iso (HX) is correlated with the respective HXC1 bond length, likely increasing negative spin density at the nucleus through growing overlap between (p) 1 and (s CH ) 2 , and thus increasing the exchange interaction. Analogously, A iso (C1) is affected by contributions from all bonds simultaneously. Regarding bond angles, A iso (HX) correlation with the opposite-lying angle is observed, e.g. H2C1H3 for A iso (H1), veried by symmetry arguments. Angles deviating from 120°likely alter the orbital conguration with respect to hybridization and inter-(s C1HX ) 2 overlap, where larger H-H distances were found to result in more negative A iso . 45 Consequently, also the two remaining angles should affect A iso . However, the opposite-lying angle is the only angle where both other H atoms either move further away or get closer concurrently to a reference 1 H nucleus. In contrast, if H1C1H2 got larger, H1C1H3 could get larger as well, amplifying the effect on A iso (H1), or it could get smaller, compensating the effect. For CH $ 3 , the most analyzed structural deformation in literature is a trigonal pyramidal distortion, characterized by the angle of the H atom's positions and the relaxed structure's plane. 43,46 To gauge this distortion, the sum of all angles was added as a feature herein and shows likewise correlation with all A iso (HX), evoked by moving 1 H away from the nodal plane toward the 2p z orbital lobes. Thus, A iso (HX) becomes less negative with growing pyramidal distortion (Fig. 2f). Concurrently, A iso (C1) becomes more positive because the p z -centered free electron transforms into a more sp 3 hybridized-type electron, in analogy to groundstate pyramidal CF $ 3 . 47 Regarding importance matrices of the full hyperne tensor, the H1C1 bond length has a greater effect on A x,y (H1) compared to A z (H1). All principal components are affected by the change in spin population at C and H atoms, but through-space contributions affect A x,y more signicantly, as reected above. For C1, A x,y,z importance values are not significantly distinguishable (Fig. S3 †).
3.4.2 Ethyl radical. The ethyl radical CH 3 CH $ 2 can be analyzed based on CH $ 3 with one H atom exchanged for a CH 3 group. The CH 2 fragment behaves analogous to CH $ 3 , exhibiting maximum spin population in the C1 p z orbital and negative spin density around H1 and H2 (Fig. 4a). Accordingly, non-zero importance values of H1, H2, and C1 are equivalent to the CH $ 3 case, taking into account that all C1-centered angles are now correlated, because C 3 rotational symmetry is absent and, therefore, each angle matters individually. Exchanging the H residue with CH 3 stabilizes the ethyl radical via hyperconjugation, 13,48 describing electron delocalization via adjacent s b-CH bonds lying above or below the H1C1H2 plane. This is manifested in large-magnitude 1 H hyperne tensors of H4 and H5 and insignicant coupling for H3 lying in the nodal plane in the relaxed structure. However, low-energy CH 3 rotation renders these atoms (b-H) magnetically equivalent along the MD trajectory. The involved large changes of A iso (b-H) depending on the dihedral angle can be identied by a strong correlation in the importance matrix, representing Karplus-like 14 behavior. Furthermore, non-zero angle importance values dene the tetrahedral-to-pyramidal distortion, altering the b-H distance to the p z orbital lobes. Interestingly, the C2C1 bond length is also signicant for the hyperne coupling of b-H. This can be attributed to negative spin density around C2 (Mulliken spin population of −0.08), possibly evoked by similar exchange interactions along the C2C1 bond as for H1C1. Further exchange interactions can result in spin polarization of b-H orbitals, as pointed out by Geoffroy and Lucken, 49 challenging the sole attribution of b-H hyperne values to McConnell-type 11,13,44 hyperconjugation. Accordingly, A iso (C2) depends strongly on the C2C1 bond but also on the C1C2b-H angles, which affects spin delocalization via hyperconjugation toward b-H, thereby offering another path for exchange interactions between b-H and C2.
3.4.3 Methyl peroxy radical. The methyl peroxy radical belongs to the class of alkyl peroxy radicals, which form upon autoxidation of organic compounds. CH 3 O $ 2 in particular is relevant for gas-phase atmospheric chemistry as an intermediate in the oxidative decomposition of methane. 50 Maximum spin density is located in oxygen p-bonded p z orbitals with Mulliken spin population of +0.69 and +0.29 for O2 and O1, respectively (Fig. 4b). The 17 O hyperne tensors exhibit consistent anisotropy and magnitude. Similar to CH 3 CH $ 2 , the p network is further stabilized by s CH hyperconjugation and Geoffroy/Lucken 49 exchange (vide supra), conrmed by large 1 H importance values regarding the dihedral angle and the C1O1 bond, respectively. In contrast to CH 3 CH $ 2 , 1 H hyperne coupling changes will not follow approximate C 2 symmetry with the dihedral, reecting p z symmetry, but are more complex due to increasing (dipolar) interactions when the dihedral approaches ecliptic HX and O2. In this position, the O2O1C1 angle dominates the distance of 1 H to maximum spin density affecting A iso (HX), as deducible from the importance matrix. Also, both 17 O coupling tensors strongly correlate with O2O1C1, indicating a signicant change of electronic structure, possibly involving facilitated hyperconjugation if O2O1C1 is small and/ or changes in the p network. A iso (C1) is comparably small and dictated by the O2O1 bond. This nding can be explained by a cascading effect; if O2O1 is short, p-bonding becomes more efficient, hence the spin population in the O1 p z orbital increases, and thus exchange interactions between the O1 p z and the C1O1 s-bond are larger. Although there is no correlation of A iso (O1) with O2O1 in the displayed importance matrix, an importance value of 1.6 is extracted for A z (O1) (Fig. S4 †), which is consistent with the interpretation.
3.4.4 Semiquinone radical. The family of semiquinones are representative aromatic radicals and serve as electron-transfer agents in biological systems. In comparison to small radicals analyzed thus far, 39 involved structure parameters dene the pbenzosemiquinone feature space, spanning a 46 × 13 importance matrix (Fig. 5), where machine learning becomes increasingly useful to reduce complexity. Oxygen and carbon atoms form a p network on which spin density is delocalized. The radical is stabilized by the OH group in para-position through its positive mesomeric (+M) effect. Mulliken spin population maxima are identied at O1 (+0.38) and at ring carbons in para-(+0.32) and ortho-position (+0.26) whereas meta-(−0.12) and ipso-positions (−0.12) are negative. The sign of spin populations is in agreement with simple valence-bond theory. 51 Ring hydrogens H1-H4 exhibit spin density sign reversal with regard to the directly bonded carbon atoms evoked by exchange interactions. 12,52 Largest A iso is observed for O1 and the importance matrix correlations suggest that its magnitude is predominantly affected by its inclusion into the aromatic ring system; via the next-nearest bonds C1O1, C2C1, and C3C1 and via its atomic positioning relative to the aromatic plane characterized by the dihedrals O1C1C3C5 and O1C1C2C4. Other deformations within the ring seem to be less impactful, in agreement with ndings that the degree of aromaticity remains fairly resistant against medium deformations therein. 53,54 A iso values of ring carbons are more complex to dene; the multitude of signicant structure parameters results in some uncertainty of the algorithm indicated by deviations from C2/C3 and C4/C5 equivalence. Nonetheless, more general statements can be made. In accordance with A iso (O1), the dihedrals involving O1 position relative to the ring also impact A iso (CX) by alteration of the degree of delocalization into the ring. For that, also the C1O1 bond seems to be signicant, especially for A iso (C1). Within the ring, C-C bond lengths dominate A iso (CX) via proposed sensitivity in orbital overlap and exchange, evoking alternating spin density signs around the ring. Inter-ring and ring-hydrogen angles are considered less impacting. Notably, the position of O2 dened by the angle O2C6C4 (and O2C6C5) seems to consistently affect A iso of ortho-and meta-carbons, possibly due to symmetry cancelation along C1C3C5C6 versus C1C2C4C6 leading to altered delocalization pathways. A clearer picture arises from ring-hydrogens where symmetry of H1/H3 and H2/H4 is largely intact. It is known that spin density at ring-hydrogens is predominantly caused by McConnell 11 exchange where A iso is proportional to the spin density at the adjacent C ring-atom. 12,52 Therefore, it is no surprise that C-C bond lengths also largely impact A iso of ring-hydrogens, which is also evident for H1 and H3 when using a smaller l (ESI, Section E †). Nonetheless, ring-hydrogens are more affected by their position above or below the nodal plane characterized by the correlation with appropriate dihedral angles.
The para-positioned OH group is the last fragment to be analyzed. O2 again is highly affected by the C1O1 bond, mediating the degree of delocalization originating from O1. Efficient delocalization is further depending on positioning above or below the aromatic plane, either on the spin density 'sending' (O1C1C2C4) or 'receiving' side (O2C6C5C3). H5 is positionally more exible with regard to the aromatic plane (C4C6O2H5), which has a large impact on A iso (H5). It is dominated by McConnell 11 exchange mediated by the H5O2 bond when within the plane and by participation in the p network above or below the plane.
In addition to the radicals discussed thus far, we also analyzed the tyrosyl radical, structurally similar to semiquinone, and tryptophan-type radical. The corresponding importance matrices are shown in Fig. S5 and S6, † respectively.

Suggestions for application
On a qualitative basis, the introduced importance matrices reveal a multitude of atomic-electronic structure relationships, which typically necessitate elaborate analyses of individual structural changes and their effect(s). [11][12][13][42][43][44][45]47,49,51,52,55 For small organic radicals, spin density distributions are oen conceptually simple to understand and partly transferrable. In contrast, interdependencies in more complicated radicals or paramagnetic species oen need to be specically investigated. For example, systematic computational studies were recently conducted to evaluate the electronic structure of transition metal phosphoroxy compounds, 35,56 which in part motivated this work. Finding relevant structure relations requires sound chemical intuition and time-consuming variational testing, where the holistic importance matrix approach could speed up the identication of structural relationships and prevent overlooked dependencies. Furthermore, molecular dynamics have oen been used to precisely predict A iso by averaging a conformational ensemble. 21,22,36,46,57,58 Apart from mimicking experimental conditions and computing averaged values, MD trajectories could also be concurrently used for the importance matrix approach, given that the needed inputs are already available. The presented procedure can be used for any system, provided hyperne coupling calculations can be performed with appropriate accuracy.
From a more practical point of view, importance matrices can help answer two types of questions for experimentalists that aid the structure identication of a paramagnetic molecule: (1) which hyperne coupling constants do we need to measure to estimate a specic structure parameter? With pre-experimental awareness of these correlations, EPR experiments can be planned using a design of experiments (DoE) approach. This might involve a decision on whether to investigate liquids or solids, which hyperne spectroscopy techniques are needed, and which atoms need to be isotope-labeled (e.g. 13 C, 14,15 N, 17

O). (2)
Which structure parameters could be responsible for a detected change in a specic hyperne coupling constant? Changes in structure parameters can have a multitude of reasons and are, therefore, frequently targeted with EPR spectroscopy. In (frozen) solutions, the solvent (or other interacting molecules) is oen responsible for structural changes, either implicitly through its dielectric continuum or explicitly, e.g., via hydrogen bonding. 59,60 Temperature was also identied to affect hyperne couplings, 61 although again through indirect solvent effects such as viscosity changes. Hyperne couplings can also be sensitive to structural changes rather remote to the actual radical center (e.g., dihedral-dependent side chain hyperconjugation in semiquinones 57,58 ), which might be disregarded in an analysis to reduce complexity/save computational resources or overlooked in the rst place.

Conclusions
We have described an application of the NCA feature selection algorithm to identify relations between atomic conguration and electronic structure in paramagnetic species. The statistical analysis is based on structure parameters and hyperne tensors of molecular dynamics snapshots, calculated by density functional theory. While conducting MD simulations at experimentally accessible temperatures would be desirable, results suggest that the semi-quantitative information of an importance matrix can be retained at elevated temperatures around 700-1200 K. Adjusting the temperature helps balancing computational costs, and it may also be helpful for sampling a larger congurational space within a particular MD trajectory, as long as the structure of a molecule is not qualitatively altered. Assigning importance quantiers to structural parameters for A x,y,z,iso creates a new perspective to look at, analyze, and understand organic radicals. For the well-characterized methyl and ethyl radicals, importance values reect known (p) 1 and (s CH ) 2 exchange interactions and hyperconjugation with b-s CH bonds. For larger radicals, more subtle correlations are identi-ed, for example that A iso (C) in the methyl peroxy radical is predominantly dictated by the oxygen-oxygen bond length, or that either exchange or p-delocalization dictates A iso of the hydroxy-hydrogen in p-benzosemiquinone, depending on the dihedral angle.
The computed importance matrices can help experimentalists in selecting the most sensitive hyperne couplings for the assessment whether a specic structure parameter is changing. For example, changes can occur via H-bonding, which is known to lengthen the C-O bond in p-benzosemiquinone. 57 A sensitive hyperne coupling constant can also help in identifying conned conformational distributions, as shown for biological tyrosyl and tryptophan radicals. 62,63 To incorporate all relevant effects, the analysis should be repeated including the explicit molecular environment, e.g., including solvent molecules. We encourage the reader to use and modify the implemented procedure written for ORCA and MATLAB and note that, in principle, also other structure-dependent atomic constants such as quadrupole couplings, chemical shi tensors, etc. or isotope variations are suitable for the presented procedure.

Data availability
The codes used for ORCA calculations are included in the ESI, Section F. † Matlab scripts/functions and trajectory data of the organic radicals investigated herein can be found at https:// doi.org/10.26165/JUELICH-DATA/UH0LM2 or at https:// github.com/conradsz/StructureHyperneRelations.